Abstract

This article discusses the determination of asymmetries. We consider a sample of
events consisting of a peak of signal events on top of some background events.
Both signal and background have an unknown asymmetry, e.g. a spin or
forward-backward asymmetry. A method is proposed which determines signal and
background asymmetries simultaneously using event weighting. For vanishing
asymmetries the statistical error of the asymmetries reaches the minimal
variance bound (MVB) given by the Cramér-Rao inequality, and it is very close
to it for large asymmetries. The method thus provides a significant gain in
statistics compared to the classical method of side band subtraction of
background asymmetries. Compared to the unbinned maximum likelihood approach,
which reaches the MVB as well, it has the advantage that it does not require
loops over the event sample in the minimization procedure.
1 Introduction



Asymmetries of cross sections, e.g. spin asymmetries and forward-backward
asymmetries, are often interesting physics quantities. For concreteness let us
consider a situation as shown in Fig. 1, where the asymmetry of the signal
events, in the central Gaussian peak with a width of $\sigma$, should be
determined from data taken in two different spin configurations. The number
density of events as a function of some kinematic variable, $x$ (typically a
reconstructed mass), is given by

$$ n^{\pm}(x) = a(x)\,\bigl(\sigma_S(x) + \sigma_B(x)\bigr) \left( 1 \pm A_S \frac{\sigma_S(x)}{\sigma_S(x) + \sigma_B(x)} \pm A_B \frac{\sigma_B(x)}{\sigma_S(x) + \sigma_B(x)} \right) $$

with $\sigma_{S,B} = \frac{1}{2}(\sigma^+_{S,B} + \sigma^-_{S,B})$. Here
$\sigma^{\pm}_S$ ($\sigma^{\pm}_B$) denotes the cross section of the
signal (background) events in the two different spin configurations $+$ and $-$.
The factor $a$ is a luminosity and acceptance factor, assumed to be the same for
the two spin configurations. The goal is to determine, from spectra as shown
in Fig. 1 and taken in two spin configurations, the two unknown asymmetries
$A_S = (\sigma^+_S - \sigma^-_S)/(\sigma^+_S + \sigma^-_S)$ and
$A_B = (\sigma^+_B - \sigma^-_B)/(\sigma^+_B + \sigma^-_B)$, assumed to be
independent of $x$. It is of course not known event-by-event whether a particular
event is signal or background; one only knows the fraction of signal events as
a function of $x$, from a fit to the event spectrum as in Fig. 1.
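
As an illustration of this setup, the following Python sketch draws a toy
sample for the two spin configurations from the number density given above.
The unit-width Gaussian signal on a flat background, the 1:1 ratio at x = 0,
the range |x| < x_max and all function and parameter names are assumptions
made for illustration only; they are not taken from an actual analysis.

```python
import math
import random

def signal_fraction(x, s_over_b=1.0):
    """S(x) = sigma_S / (sigma_S + sigma_B) for a unit Gaussian on a flat background.

    s_over_b is the (assumed) signal-to-background ratio at x = 0.
    """
    sig = s_over_b * math.exp(-0.5 * x * x)
    return sig / (sig + 1.0)

def generate_sample(n_trials, a_s, a_b, x_max=5.0, s_over_b=1.0, seed=1):
    """Return (x_plus, x_minus), the event positions in the + and - configurations.

    Accept-reject sampling of sigma_S + sigma_B, followed by a random choice of
    the configuration with probability (1 +- S*A_S +- B*A_B) / 2.
    """
    rng = random.Random(seed)
    x_plus, x_minus = [], []
    envelope = s_over_b + 1.0  # maximum of sigma_S + sigma_B, reached at x = 0
    for _ in range(n_trials):
        x = rng.uniform(-x_max, x_max)
        sig = s_over_b * math.exp(-0.5 * x * x)
        if rng.random() * envelope > sig + 1.0:
            continue  # trial rejected: no event at this x
        s = signal_fraction(x, s_over_b)
        p_plus = 0.5 * (1.0 + s * a_s + (1.0 - s) * a_b)
        if rng.random() < p_plus:
            x_plus.append(x)
        else:
            x_minus.append(x)
    return x_plus, x_minus

x_plus, x_minus = generate_sample(200000, a_s=0.5, a_b=0.0)
raw_asymmetry = (len(x_plus) - len(x_minus)) / (len(x_plus) + len(x_minus))
```

The raw asymmetry of such a sample is diluted by the background, which is why
the signal fraction as a function of x is needed to recover the signal
asymmetry.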

Section 2 presents the simplest method, based on counting rate asymmetries.
Section 3 describes the unbinned likelihood method, which is known to yield
the smallest possible variance of all unbiased estimators in the limit of an
infinite number of events, thus reaching the minimal variance bound (MVB)
given by the Cramér-Rao inequality. Section 4 presents a new asymmetry
estimator, based on weighted events. This estimator is also unbiased in the
large $N$ limit, i.e. it is consistent, and it comes very close to reaching the
minimal variance bound. The advantage is that it can also be used in cases where
the unbinned likelihood method is cumbersome because of a large number of events.
Event weighting to extract the number of signal and background events was
discussed in Ref. [1], but the extraction of asymmetries is not discussed in
this reference. The different methods are compared in Section 5.



2 Estimator based on counting rate asymmetries 

A method often found in the literature [2,3] is to determine the asymmetry in
a $k$-standard-deviation region around the peak, a region which includes both
signal and background; then to measure the background asymmetry in some
side bands around the signal peak ($-k_{max}\sigma < x < -k_{min}\sigma$ and
$k_{min}\sigma < x < k_{max}\sigma$) and to use the result to correct the
asymmetry measured in the peak region. For the sake of simplicity we will set
$\sigma = 1$, so that everywhere below we can write $k$ instead of $k\sigma$.

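Fig. 1. Example of signal events originating from a Gaussian distribution centered
at $x = 0$ and width $\sigma = 1$ sitting on a constant background.

The expectation value of the counting rate asymmetry, $A^{cnt}$, in the range
$-k < x < k$ is related to $A_S$ and $A_B$ in the following way:

$$ \langle A^{cnt} \rangle = \frac{\langle N^+ \rangle - \langle N^- \rangle}{\langle N^+ \rangle + \langle N^- \rangle} = A_S \frac{\int_{-k}^{k} a\sigma_S\,dx}{\int_{-k}^{k} a(\sigma_S + \sigma_B)\,dx} + A_B \frac{\int_{-k}^{k} a\sigma_B\,dx}{\int_{-k}^{k} a(\sigma_S + \sigma_B)\,dx} \tag{1} $$

where we used $\langle N^+ \rangle = \int n^+(x)\,dx$ and
$\langle N^- \rangle = \int n^-(x)\,dx$. An estimator for $A_S$ is given by:

$$ \hat{A}_S = \frac{\int_{-k}^{k} a(\sigma_S + \sigma_B)\,dx}{\int_{-k}^{k} a\sigma_S\,dx} \left( A^{cnt} - \frac{\int_{-k}^{k} a\sigma_B\,dx}{\int_{-k}^{k} a(\sigma_S + \sigma_B)\,dx}\, \hat{A}_B \right) \tag{2} $$

where $\hat{A}_B$ is the background asymmetry measured in the side bands.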

Note that, strictly speaking, the first equality in Eq. (1) is valid only in the
large $N$ limit. In this limit Eqs. (1) and (2) indicate that
$\langle \hat{A}_S \rangle = A_S$, i.e. $\hat{A}_S$ is a consistent estimator.

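The corresponding figure of merit, FOM $= 1/\sigma^2_{\hat{A}_S}$, reads

$$ \mathrm{FOM} = \left( \frac{\int_{-k}^{k} a\sigma_S\,dx}{\int_{-k}^{k} a(\sigma_S + \sigma_B)\,dx} \right)^2 \left[ \sigma^2_{cnt} + \left( \frac{\int_{-k}^{k} a\sigma_B\,dx}{\int_{-k}^{k} a(\sigma_S + \sigma_B)\,dx} \right)^2 \sigma^2_{\hat{A}_B} \right]^{-1} \tag{3} $$

where $\sigma_{cnt}$ and $\sigma_{\hat{A}_B}$ denote the statistical errors of
$A^{cnt}$ and $\hat{A}_B$, respectively.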



Here and in the following we assume small asymmetries, such that for the error
calculation the approximation $\langle N^+ \rangle \simeq \langle N^- \rangle$
is valid. In this case one finds
$1/\sigma^2_{cnt} = \int_{-k}^{k} [n^+(x) + n^-(x)]\,dx$ and
$1/\sigma^2_{\hat{A}_B} = \int_{-k_{max}}^{-k_{min}} [n^+(x) + n^-(x)]\,dx + \int_{k_{min}}^{k_{max}} [n^+(x) + n^-(x)]\,dx$.
Introducing these values of $\sigma_{cnt}$ and $\sigma_{\hat{A}_B}$ in Eq. (3)
shows that the FOM depends on the choice of both the signal region ($k$) and
the background region ($k_{min}$ and $k_{max}$). The solid line in Fig. 2 shows
the FOM as a function of $k_{max}$, for $k_{min} = 3$, which is a reasonable
value to make sure that the side bands include a negligible amount of signal.
The signal region, i.e. the value for $k$, is chosen in order to maximize the
FOM for the given $k_{max}$. The FOM depends also on the signal to background
ratio, here chosen to be 1:1 at $x = 0$, as in Fig. 1.
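
The side band subtraction described in this section can be sketched in a few
lines of Python. This is a minimal illustration, assuming a unit-width
Gaussian signal on a flat background with a constant factor a, so that the
signal fraction in the peak region has a closed form; the function name and
its arguments are hypothetical. The error estimate uses the small-asymmetry
approximation, with the variances taken as the inverse event counts.

```python
import math

def sideband_subtraction(x_plus, x_minus, k=1.0, k_min=3.0, k_max=4.0, s_over_b=1.0):
    """Return (A_S estimate, FOM) from side band subtraction, with sigma = 1.

    x_plus / x_minus are the event positions in the + / - configurations.
    """
    # Counts in the peak region |x| < k
    n_peak_p = sum(1 for x in x_plus if abs(x) < k)
    n_peak_m = sum(1 for x in x_minus if abs(x) < k)
    # Counts in the side bands k_min < |x| < k_max
    n_side_p = sum(1 for x in x_plus if k_min < abs(x) < k_max)
    n_side_m = sum(1 for x in x_minus if k_min < abs(x) < k_max)
    n_peak = n_peak_p + n_peak_m
    n_side = n_side_p + n_side_m

    a_cnt = (n_peak_p - n_peak_m) / n_peak   # counting rate asymmetry in the peak
    a_bg = (n_side_p - n_side_m) / n_side    # background asymmetry from side bands

    # Signal fraction in |x| < k: Gaussian integral over Gaussian plus flat part
    gauss = s_over_b * math.sqrt(2.0 * math.pi) * math.erf(k / math.sqrt(2.0))
    f_sig = gauss / (gauss + 2.0 * k)

    # Correct the peak asymmetry for the background contribution
    a_s_hat = (a_cnt - (1.0 - f_sig) * a_bg) / f_sig

    # Error propagation for small asymmetries: var(A_cnt) ~ 1/n_peak, var(A_bg) ~ 1/n_side
    variance = (1.0 / n_peak + (1.0 - f_sig) ** 2 / n_side) / f_sig ** 2
    return a_s_hat, 1.0 / variance
```

For example, 60 versus 40 events in the peak and 10 versus 10 in the side
bands give a raw asymmetry of 0.2, which is scaled up by the inverse signal
fraction because the side bands show no background asymmetry.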



3 Maximum Likelihood asymmetry estimators 



In the large $N$ limit, the unbinned maximum likelihood method is known to
provide an unbiased estimator for the parameters $A_S$ and $A_B$, which reaches
the minimal variance bound. Since the numbers of events $N^+$ and $N^-$ are not
fixed, an extended maximum likelihood method has to be used. With the
definitions $S_i = \sigma_S(x_i)/(\sigma_S(x_i) + \sigma_B(x_i))$,
$B_i = \sigma_B(x_i)/(\sigma_S(x_i) + \sigma_B(x_i))$ and
$\alpha_i = a(x_i)\,(\sigma_S(x_i) + \sigma_B(x_i))$ the log likelihood function reads:

$$ l = \ln \mathcal{L} = \sum_1 \ln\bigl(\alpha_i (1 + S_i A_S + B_i A_B)\bigr) - \langle N^+ \rangle(A_S, A_B) + \sum_2 \ln\bigl(\alpha_i (1 - S_i A_S - B_i A_B)\bigr) - \langle N^- \rangle(A_S, A_B), $$

where $\sum_1$ ($\sum_2$) runs over all events in the $+$ ($-$) configuration and
in the range $-k_{max} < x < k_{max}$, while
$\langle N^{\pm} \rangle(A_S, A_B) = \int n^{\pm}(x)\,dx$. The first derivative is

$$ \frac{\partial l}{\partial A_S} = \sum_1 \frac{S_i}{1 + S_i A_S + B_i A_B} - \sum_2 \frac{S_i}{1 - S_i A_S - B_i A_B} \tag{4} $$

with a similar expression for $A_B$. Note that the terms with
$\langle N^+ \rangle$ and $\langle N^- \rangle$ cancel each other because the
same $a$ is assumed for the two configurations. The set of equations
$\partial l/\partial A_{S,B} = 0$ can be solved for $A_S$ and $A_B$.

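For small asymmetries a first order expansion in $A_S$ and $A_B$ gives the set of
equations

$$ \begin{aligned}
\Bigl( \sum_1 S_i^2 + \sum_2 S_i^2 \Bigr) A_S + \Bigl( \sum_1 S_i B_i + \sum_2 S_i B_i \Bigr) A_B &= \sum_1 S_i - \sum_2 S_i, \\
\Bigl( \sum_1 S_i B_i + \sum_2 S_i B_i \Bigr) A_S + \Bigl( \sum_1 B_i^2 + \sum_2 B_i^2 \Bigr) A_B &= \sum_1 B_i - \sum_2 B_i
\end{aligned} \tag{5} $$

and the covariance matrix of the two parameters $A_S$ and $A_B$ reads:

$$ \mathrm{cov}^{-1}(A_S, A_B) = - \begin{pmatrix} \dfrac{\partial^2 l}{\partial A_S^2} & \dfrac{\partial^2 l}{\partial A_S \partial A_B} \\ \dfrac{\partial^2 l}{\partial A_S \partial A_B} & \dfrac{\partial^2 l}{\partial A_B^2} \end{pmatrix} = \begin{pmatrix} \sum S_i^2 & \sum S_i B_i \\ \sum S_i B_i & \sum B_i^2 \end{pmatrix}. \tag{6} $$

For the FOM of $A_S$ one finds

$$ \mathrm{FOM} = (1 - \rho^2) \sum S_i^2 \quad \text{with} \quad \rho = \frac{\sum S_i B_i}{\sqrt{\sum S_i^2 \sum B_i^2}}. \tag{7} $$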



Note that, if not otherwise stated, all sums run over both event samples, 1 
and 2. 

The dotted line in Fig. 2 shows this FOM as a function of $k_{max}$, i.e. for events
in the region $-k_{max} < x < k_{max}$. For a given range of data available, defined by
kmax, it is always larger than the FOM obtained with the side band subtraction 
method shown by the solid line. The latter method does not reach the minimal 
variance bound. 
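
For asymmetries that are not small, the likelihood equations have to be solved
numerically. The sketch below uses a plain Newton iteration on the first
derivatives of the log likelihood; the choice of Newton's method, the function
name and the input format (lists of per-event weights S_i and B_i) are
assumptions for illustration, not a prescription from the text. It makes
explicit that every iteration loops over all events.

```python
def fit_likelihood(plus, minus, n_iter=20, tol=1e-12):
    """Maximize the log likelihood in (A_S, A_B) by Newton iteration.

    plus / minus: lists of (S_i, B_i) for events in the + / - configuration.
    Returns (A_S, A_B, FOM of A_S). No protection against 1 +- (...) <= 0 is
    included, and the events must not all share the same S_i/B_i ratio
    (otherwise the 2x2 system is singular).
    """
    a_s = a_b = 0.0
    det = j_bb = 1.0
    for _ in range(n_iter):
        g_s = g_b = j_ss = j_sb = j_bb = 0.0
        # One full pass over both event samples per iteration
        for sign, events in ((1.0, plus), (-1.0, minus)):
            for s, b in events:
                d = 1.0 + sign * (s * a_s + b * a_b)
                g_s += sign * s / d          # first derivative with respect to A_S
                g_b += sign * b / d          # first derivative with respect to A_B
                j_ss += s * s / (d * d)      # minus the second derivatives
                j_sb += s * b / (d * d)
                j_bb += b * b / (d * d)
        det = j_ss * j_bb - j_sb * j_sb
        step_s = (j_bb * g_s - j_sb * g_b) / det
        step_b = (j_ss * g_b - j_sb * g_s) / det
        a_s += step_s
        a_b += step_b
        if abs(step_s) + abs(step_b) < tol:
            break
    # FOM = 1 / var(A_S), from the inverse of the last curvature matrix evaluated
    fom = det / j_bb
    return a_s, a_b, fom
```

Starting from zero asymmetries, the first Newton step reproduces the solution
of the linearized equations for small asymmetries; further iterations only
matter when the asymmetries are sizeable.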



4 Extracting the asymmetries using event weighting 



In this section a method to extract As (and simultaneously Ab) using event 
weighting is developed. It is clear that the estimator based on the counting 
rate asymmetry is not statistically optimal since it gives the same weight to 
all events. Better estimators can be obtained by weighting each event by the 
signal strength, $S_i$, and by the background strength, $B_i$. These weight factors
coincide with the optimal weights found in [1] to extract the number of signal 
events. They are used to build the following asymmetries: 

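$$ a_S = \frac{\sum_1 S_i - \sum_2 S_i}{\sum_1 S_i + \sum_2 S_i}, \qquad a_B = \frac{\sum_1 B_i - \sum_2 B_i}{\sum_1 B_i + \sum_2 B_i}. \tag{8} $$

In the large $N$ limit, the expectation values of $a_S$ and $a_B$ are

$$ \langle a_S \rangle = A_S \frac{\int \alpha S^2\,dx}{\int \alpha S\,dx} + A_B \frac{\int \alpha B S\,dx}{\int \alpha S\,dx}, \tag{9} $$

$$ \langle a_B \rangle = A_S \frac{\int \alpha B S\,dx}{\int \alpha B\,dx} + A_B \frac{\int \alpha B^2\,dx}{\int \alpha B\,dx}, \tag{10} $$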

where $\alpha = a(x)\,(\sigma_S(x) + \sigma_B(x))$, as in Section 3. The ratios of
integrals can easily be obtained from the event sample, e.g.
$\int \alpha S^2\,dx / \int \alpha S\,dx \simeq \sum_{1,2} S_i^2 / \sum_{1,2} S_i$,
which results exactly in the set of equations found for the likelihood method
in the small asymmetry limit. So the FOM is still $(1 - \rho^2) \sum S_i^2$. This
result can of course also be obtained directly, by simple error propagation
using the expressions found for $A_S$ and $A_B$ from Eqs. (9) and (10). Appendix A
shows that the factor $\rho$ is actually the correlation coefficient between
$\sum S_i$ and $\sum B_i$.

This shows that the weighting method and the unbinned likelihood method are
identical for small asymmetries. The advantage of the weighting method is that
the estimators derived from Eqs. (9) and (10) can also be used for arbitrary
asymmetries, whereas the likelihood method requires in this case a numerical
maximization of $\ln \mathcal{L}$ with loops over all events. For the sake of
simplicity, the error calculation was only presented for small asymmetries.
Extending it to arbitrary asymmetries is straightforward but lengthy; it shows
that the FOM of the weighting method is only slightly smaller than the FOM of
the unbinned likelihood method. For example, for a signal to background ratio as
given in Fig. 1 and asymmetries smaller than 50%, the decrease in the FOM is
less than 1%.

Fig. 2. FOM of $A_S$ as a function of the maximum range of data available defined
by $k_{max}$, for the classical method of side band subtraction (solid line) and for the
likelihood or weighting method (dotted line). In the side band subtraction method
$k_{min} = 3$ and for each value of $k_{max}$ the value of $k$, defining the signal region,
is chosen in order to maximize the FOM. The figures of merit are normalized to
the maximum FOM reachable in the likelihood or weighting method in the limit
$k_{max} \to \infty$. In this case FOM $= \sum_i S_i^2$.

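
The weighting estimator can be sketched as follows. This is a minimal
illustration, assuming each event comes with its weights S_i and B_i and a
sign for its spin configuration; the function name and the event format are
hypothetical. All sums are accumulated in a single pass over the events, after
which only a 2x2 linear system remains to be solved.

```python
import math

def weighted_asymmetries(events):
    """Return (A_S, A_B, FOM of A_S) from weighted event sums.

    events: iterable of (S_i, B_i, sign), with sign = +1 or -1 for the
    + or - spin configuration.
    """
    diff_s = diff_b = sum_s = sum_b = s2 = sb = b2 = 0.0
    for s, b, sign in events:        # single pass, no iteration over the sample
        diff_s += sign * s
        diff_b += sign * b
        sum_s += s
        sum_b += b
        s2 += s * s
        sb += s * b
        b2 += b * b

    # Weighted raw asymmetries, Eq. (8)
    a_s_raw = diff_s / sum_s
    a_b_raw = diff_b / sum_b

    # Eqs. (9) and (10), with the ratios of integrals replaced by sample sums:
    #   a_s_raw = A_S * s2/sum_s + A_B * sb/sum_s
    #   a_b_raw = A_S * sb/sum_b + A_B * b2/sum_b
    m11, m12 = s2 / sum_s, sb / sum_s
    m21, m22 = sb / sum_b, b2 / sum_b
    det = m11 * m22 - m12 * m21
    a_s = (m22 * a_s_raw - m12 * a_b_raw) / det
    a_b = (m11 * a_b_raw - m21 * a_s_raw) / det

    # Small-asymmetry figure of merit, FOM = (1 - rho^2) * sum(S_i^2)
    rho = sb / math.sqrt(s2 * b2)
    fom = (1.0 - rho * rho) * s2
    return a_s, a_b, fom
```

Because the sums can be filled event by event, the method also works on data
that are streamed or already reduced to a handful of accumulators.
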
The weighting method can also be extended to more complicated cases where,
for example, the acceptance factors $a$ are not the same in the two spin
configurations, or even when the asymmetries have to be determined from four
counting rates in order to cancel differences of acceptances and flux factors
for the two spin configurations, as in Ref. [5].



5 Discussion of the results and summary



A comparison of the two curves in Fig. 2 shows that the FOM of the likelihood
or event weighting method is always larger than the corresponding FOM for
the classical method. For a signal-to-background ratio of 1:1 at $x = 0$, as in
Fig. 1, the gain is 23% for $k_{max} = 4$ and 7% for $k_{max} = 10$. For $k_{max} = 10$
the gain is 2% and 10% for a signal-to-background ratio of 10:1 and 1:10,
respectively. Apart from the gain in statistics it should also be noted that the
weighting method avoids the arbitrary choice of the background region, which
starts here at $3\sigma$. For Breit-Wigner distributions, for example, this choice is
less obvious.
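
The size of the gain can be cross-checked numerically from the expected
figures of merit. The sketch below assumes a unit-width Gaussian signal on a
flat background, a = 1, small asymmetries, and a common overall normalization
that cancels in the ratio; the Simpson integration, the scan grid for k and
all function names are illustrative choices. Under these assumptions the
result for the 1:1 case should come out close to the gains quoted above.

```python
import math

def integrate(f, lo, hi, n=400):
    """Composite Simpson rule with n (even) intervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0

def fom_likelihood(k_max, s_over_b=1.0):
    """Expected (1 - rho^2) * sum(S_i^2) for events in |x| < k_max."""
    def sig(x):
        return s_over_b * math.exp(-0.5 * x * x)
    # alpha*S^2, alpha*S*B and alpha*B^2, with alpha = sigma_S + sigma_B and sigma_B = 1
    s2 = integrate(lambda x: sig(x) ** 2 / (sig(x) + 1.0), -k_max, k_max)
    sb = integrate(lambda x: sig(x) / (sig(x) + 1.0), -k_max, k_max)
    b2 = integrate(lambda x: 1.0 / (sig(x) + 1.0), -k_max, k_max)
    return s2 - sb * sb / b2

def fom_sideband(k, k_min, k_max, s_over_b=1.0):
    """Expected FOM of side band subtraction for a signal region |x| < k."""
    def sig(x):
        return s_over_b * math.exp(-0.5 * x * x)
    n_sig = integrate(sig, -k, k)
    n_peak = n_sig + 2.0 * k                                   # signal plus background
    n_side = 2.0 * (integrate(sig, k_min, k_max) + (k_max - k_min))
    f_sig = n_sig / n_peak
    return f_sig ** 2 / (1.0 / n_peak + (1.0 - f_sig) ** 2 / n_side)

def gain(k_max, k_min=3.0, s_over_b=1.0):
    """Relative FOM gain of the likelihood/weighting method over side band subtraction."""
    # Scan the signal region k from 0.5 to 3.0 and keep the best side band FOM
    best = max(fom_sideband(0.5 + 0.01 * i, k_min, k_max, s_over_b) for i in range(251))
    return fom_likelihood(k_max, s_over_b) / best - 1.0

gain_4 = gain(4.0)
gain_10 = gain(10.0)
```

Calling `gain(10.0, s_over_b=10.0)` or `gain(10.0, s_over_b=0.1)` evaluates the
other signal-to-background ratios discussed in this section in the same way.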

In summary, a new set of two estimators was presented to determine
simultaneously signal and background asymmetries. These estimators are unbiased
in the large $N$ limit, i.e. they are consistent. For small asymmetries they are
also efficient, i.e. they reach the minimal variance bound, like the
statistically optimal unbinned likelihood method. This is in contrast to the
classical method of side band subtraction. These estimators can actually be
derived from the likelihood method in the case of vanishing asymmetries. For
large asymmetries their variances are still very close to the minimal variance
bound. The advantage of the method is its applicability in cases where the
likelihood method is cumbersome.



A Derivation of the covariance matrix cov($a_S$, $a_B$) and correlation
coefficient $\rho$



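Consider two weight factors $S$ and $B$. The covariance between $\sum_i S_i$ and
$\sum_j B_j$ is given by:

$$ \begin{aligned}
\mathrm{cov}\Bigl( \sum_i S_i, \sum_j B_j \Bigr) &= \Bigl\langle \sum_i S_i \sum_j B_j \Bigr\rangle - \Bigl\langle \sum_i S_i \Bigr\rangle \Bigl\langle \sum_j B_j \Bigr\rangle \\
&= \Bigl\langle \sum_i S_i B_i + \sum_{i \neq j} S_i B_j \Bigr\rangle - \Bigl\langle \sum_i S_i \Bigr\rangle \Bigl\langle \sum_j B_j \Bigr\rangle \\
&= \langle N \rangle \langle SB \rangle + \langle N(N-1) \rangle \langle S \rangle \langle B \rangle - \langle N \rangle^2 \langle S \rangle \langle B \rangle \\
&= \langle N \rangle \langle SB \rangle + \bigl( \langle N^2 \rangle - \langle N \rangle - \langle N \rangle^2 \bigr) \langle S \rangle \langle B \rangle .
\end{aligned} $$

If the number of events $N$ is Poisson distributed, i.e.
$\langle N^2 \rangle - \langle N \rangle - \langle N \rangle^2 = 0$, one finds
$\mathrm{cov}(\sum_i S_i, \sum_j B_j) = \langle N \rangle \langle SB \rangle \simeq \sum_i S_i B_i$.
The error on the sums of weights is given by
$\sigma_S^2 = \mathrm{cov}(\sum_i S_i, \sum_j S_j) \simeq \sum_i S_i^2$. Thus the
correlation coefficient is

$$ \rho = \frac{\mathrm{cov}\bigl( \sum_i S_i, \sum_j B_j \bigr)}{\sigma_S \sigma_B} = \frac{\sum_i S_i B_i}{\sqrt{\sum_i S_i^2 \sum_i B_i^2}}. \tag{A.1} $$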
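
The Poisson result for the covariance of the two sums of weights can be
checked with a small Monte Carlo. The sketch below is an illustration under
stated assumptions: events are drawn uniformly in |x| < 3, the weight S is the
signal fraction of a unit Gaussian on a flat background with B = 1 - S, and the
Poisson sampler, mean event number and all names are choices made here for the
example.

```python
import math
import random

def poisson(rng, mean):
    """Poisson random number by Knuth's multiplication method (small means only)."""
    limit = math.exp(-mean)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def signal_weight(x):
    """S(x) for a unit Gaussian on a flat background, 1:1 at x = 0 (assumed)."""
    e = math.exp(-0.5 * x * x)
    return e / (1.0 + e)

def covariance_check(n_exp=20000, mean_n=20.0, x_max=3.0, seed=7):
    """Return (Monte Carlo covariance of sum S and sum B, expected <N><SB>)."""
    rng = random.Random(seed)
    sums_s, sums_b = [], []
    for _ in range(n_exp):
        n_events = poisson(rng, mean_n)          # Poisson distributed event number
        tot_s = tot_b = 0.0
        for _event in range(n_events):
            s = signal_weight(rng.uniform(-x_max, x_max))
            tot_s += s
            tot_b += 1.0 - s
        sums_s.append(tot_s)
        sums_b.append(tot_b)
    mean_s = sum(sums_s) / n_exp
    mean_b = sum(sums_b) / n_exp
    cov = sum((s - mean_s) * (b - mean_b) for s, b in zip(sums_s, sums_b)) / (n_exp - 1)

    # <SB> for x uniform in [-x_max, x_max], by the midpoint rule
    m = 10000
    step = 2.0 * x_max / m
    mean_sb = 0.0
    for i in range(m):
        s = signal_weight(-x_max + (i + 0.5) * step)
        mean_sb += s * (1.0 - s)
    mean_sb /= m
    return cov, mean_n * mean_sb

cov_mc, cov_expected = covariance_check()
```

With a fixed instead of a Poisson distributed number of events, the term
proportional to the product of the mean weights no longer cancels and the
covariance becomes smaller.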